Consistent Logical Checkpointing

Author

  • Nitin H. Vaidya
Abstract

A "consistent checkpointing" algorithm saves a consistent view of the distributed system state on stable storage. The loss of computation upon a failure can be bounded by taking consistent checkpoints with adequate frequency. The traditional consistent checkpointing algorithms require the different processes to save their state at about the same time. This causes contention for the stable storage, potentially resulting in large overheads. Staggering the checkpoints taken by various processes can reduce the overhead. Some techniques for staggering the checkpoints have been proposed previously [9]; however, these techniques result in "limited staggering" in that not all processes' checkpoints can be staggered. Ideally, one would like to stagger the checkpoints arbitrarily. This report presents a simple approach to arbitrarily stagger the checkpoints. Our approach requires that the processes take consistent logical checkpoints, as compared to the consistent physical checkpoints enforced by existing algorithms. This report discusses the proposed approach and the implementation issues. The proposed approach was discussed briefly in [11].
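
The abstract does not spell out how a logical checkpoint is built. As a rough illustration only, one common way to realize the idea is to pair an earlier physical checkpoint with a log of the messages received since then, so that the state at a later point can be rebuilt by replay. The sketch below assumes a deterministic toy process whose state is a running sum; the `Process` class, its method names, and the `restore` helper are hypothetical and are not taken from the report.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Process:
    """Toy deterministic process (hypothetical): state is a sum of received messages."""
    state: int = 0
    saved_state: Optional[int] = None  # stands in for a physical checkpoint on stable storage
    message_log: List[int] = field(default_factory=list)  # messages since that checkpoint

    def take_physical_checkpoint(self) -> None:
        # Save the full state; each process may do this at a different (staggered) time.
        self.saved_state = self.state
        self.message_log = []

    def receive(self, msg: int) -> None:
        # Log the message, then apply it deterministically.
        self.message_log.append(msg)
        self.state += msg

    def establish_logical_checkpoint(self) -> Tuple[int, List[int]]:
        # No new state save: the logical checkpoint is the earlier physical
        # checkpoint plus the messages logged up to this point.
        if self.saved_state is None:
            raise RuntimeError("no physical checkpoint taken yet")
        return self.saved_state, list(self.message_log)


def restore(logical_checkpoint: Tuple[int, List[int]]) -> int:
    """Rebuild the state at the logical checkpoint by replaying the logged messages."""
    saved_state, log = logical_checkpoint
    state = saved_state
    for msg in log:
        state += msg
    return state


p = Process()
p.receive(5)
p.take_physical_checkpoint()             # physical checkpoint taken early
p.receive(3)
p.receive(4)
lc = p.establish_logical_checkpoint()    # logical checkpoint established later
p.receive(10)                            # work done after the logical checkpoint
recovered = restore(lc)
```

In this toy run the physical checkpoint and the logical checkpoint are taken at different times, which is the property that would let processes stagger their writes to stable storage while still agreeing on a common recovery point.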

Related articles

Staggered Consistent Checkpointing

A consistent checkpointing algorithm saves a consistent view of a distributed application's state on stable storage. The traditional consistent checkpointing algorithms require different processes to save their state at about the same time. This causes contention for the stable storage, potentially resulting in large overheads. Staggering the checkpoints taken by various processes can reduce c...

Checkpointing yNitin

A consistent checkpointing algorithm saves a consistent view of a distributed application's state on stable storage. The traditional consistent checkpointing algorithms require different processes to save their state at about the same time. This causes contention for the stable storage, potentially resulting in large overheads. Staggering the checkpoints taken by various processes can reduce che...

On Staggered Checkpointing

A consistent checkpointing algorithm saves a consistent view of a distributed application's state on stable storage. The traditional consistent checkpointing algorithms require different processes to save their state at about the same time. This causes contention for the stable storage, potentially resulting in large overheads. Staggering the checkpoints taken by various processes can reduce ch...

Falkirk Wheel: Rollback Recovery for Dataflow Systems

We present a new model for rollback recovery in distributed dataflow systems. We explain existing rollback schemes by assigning a logical time to each event such as a message delivery. If some processors fail during an execution, the system rolls back by selecting a set of logical times for each processor. The effect of events at times within the set is retained or restored from saved state, wh...

Lazy Checkpointing Coordination for Bounding Rollback Propagation

In this paper, we propose the technique of lazy checkpoint coordination which preserves process autonomy while employing communication-induced checkpoint co... shown that logging a nondeterministic event equivalently places a logical checkpoint [18] at the end of the ensuing state interval, and these extra logical checkpoints serve to eliminate the domino effect. Coordinated checkpointing achieves ...

Publication date: 1994